2 Abstract


Note: This document is currently in draft stage. It has not been peer-reviewed and 
is subject to changes. 


This document presents a comprehensive analysis of user data from the Duolicious platform, with a 
particular focus on the identification and examination of anomalous data points and user relation- 
ship status. The data, which represents users from over 150 countries, was visualized using various 
graphical representations to highlight notable trends, patterns, and outliers. 


During the analysis, we encountered several instances of unrealistic height values for both men and 
women. These anomalies included negative heights and heights over 210cm, which fell outside the 
typical range for human height. These anomalies were filtered out to ensure the accuracy of our 
statistical calculations and visualizations. 


A significant finding from the analysis was the gender and sexual orientation distribution among 
users. For every woman on the platform, there were 15 men. Furthermore, only 20% of the 
women identified as straight. This implies that for every straight woman on the platform, there are 
approximately 60 men, indicating a highly skewed gender and sexual orientation ratio. [KCYY14]


Additionally, we found inconsistencies in the relationship status data. There were 28 users who 
identified as ‘married’ but were also listed as looking for marriage. This discrepancy points to 
potential issues with how user data is collected or categorized on the platform. 


Statistical measures such as mean, median, and standard deviation were calculated and visualized 
on the graphs. The standard deviation was also represented as an error bar and as a separate line 
on the graph. The document concludes with a plot of the normal distribution fitted to the height
data, providing a clear picture of the data’s dispersion and central tendency.


The identification and handling of these anomalies, along with the insights into the platform’s 
gender, sexual orientation, and relationship status distribution, underscore the importance of data 
analysis in understanding user demographics. Despite these challenges, the analysis provides valu- 
able insights into the user base of the Duolicious platform, aiding in the understanding of its 
global reach, diversity, gender dynamics, sexual orientation distribution, and relationship status 
inconsistencies. 


3 Introduction 


The internet has revolutionized the way people connect and form relationships. Online dating plat- 
forms have become an integral part of modern dating culture, offering a convenient and accessible 
way for individuals to meet potential partners. In recent years, niche dating apps have emerged 
to cater to specific interests and communities. Duolicious, a dating app designed for redditors and 
users from 4chan, provides a unique platform for individuals who share a common online heritage 
to connect and form meaningful relationships. [MC17] 


This report presents the findings of a PhD project conducted at the University of Munich,
Department of Media and Communication Studies, under the supervision of Dr. Sophia Patel. The
study was supported by funding from the European Union’s Horizon 2020 research and innovation
program, specifically through the Marie Skłodowska-Curie Actions (MSCA) Individual Fellowship
grant (grant agreement number: 101024343). This funding enabled the research team to conduct 
a comprehensive and in-depth analysis of the online dating behaviors and preferences of redditors 
and 4chan users. 


The application largely respects the anonymity of its users: an account can be created with any
email address (we used a temporary mailbox) and no phone number is required. 4chan is a
well-known venue for online ephemerality and anonymity. [BMHH*11]


The authors, Dr. Elena Vasquez and Dr. Julian Müller, aimed to investigate the online dating
behaviors, preferences, and relationship patterns of redditors and 4chan users, utilizing a mixed- 
methods approach that combines quantitative data analysis with qualitative insights. As part of 
this research, we collected and analyzed a large dataset of user interactions, profiles, and survey 
responses from Duolicious, with the goal of shedding light on the complexities of online dating 
within these communities. 


With millions of users across these platforms, understanding the behaviors, preferences, and inter- 
ests of redditors and 4chan users can provide valuable insights into the online dating landscape. 
This report presents a comprehensive analysis of the Duolicious dataset, exploring user demograph- 
ics, behavior patterns, and relationship preferences. Our research aims to contribute to a better 
understanding of the intricacies of online dating, while also informing the development of more 
effective and inclusive dating platforms. [CB20] 


Through this study, we hope to provide actionable insights for dating app developers, researchers, 
and policymakers, while also advancing our understanding of the intersection of technology, re- 
lationships, and community. The findings presented in this report are intended to spark further 
discussion and research into the complex dynamics of online dating, with a focus on promoting 
healthy, respectful, and fulfilling relationships in the digital age. [CB20] 


3.1 Tools used to extract the information 
3.1.1 Web Scraping Methodology 


To collect data from Duolicious, we employed a straightforward web scraping approach using a 
Python script that leveraged the requests library to send HTTP GET requests to the platform’s 
API. Our script targeted specific API endpoints, allowing us to retrieve user data, profiles, and 
interactions for a large number of users. 


To minimize the risk of detection and IP blocking, we utilized two Virtual Private Networks (VPNs) 
based in the United States to rotate our IP addresses. This simple yet effective approach enabled us 
to distribute our requests across different IP addresses, making it more challenging for Duolicious’s 
servers to identify and block our activities. 


Despite these precautions, our IP addresses were eventually banned by Duolicious’s servers. How- 
ever, surprisingly, our account remained active and unaffected, suggesting that the platform’s 
banning mechanism is IP-based rather than account-based. This allowed us to continue scraping 
data for an extended period, ultimately collecting a substantial dataset for analysis. [DSN23] 


3.2 Scripts that were used for this investigation 
3.2.1 Utility functions 


One of the essential components of our data analysis pipeline was a set of custom functions to
extract and manipulate data from the SQLite database. Two such functions, get_unique_entries and
get_column_names, played a critical role in our analysis.


The get_unique_entries function retrieves the unique entries in a specified column of a given
table, along with their respective frequencies. It takes three parameters: a SQLite connection
object, a column name, and a table name. It uses a SQL query to group the data by the specified
column and count the frequency of each unique entry. The result is stored in a dictionary whose
keys are the unique entries and whose values are their frequencies. To ensure data quality, the
function also removes any null or None values from the resulting dictionary. Its output shows the
distribution of values in a column, enabling us to identify patterns and trends in the data.


The get_column_names function serves a more auxiliary purpose: it retrieves the column names of
a specified table. It takes two parameters: a SQLite connection object and a table name. It uses
the PRAGMA table_info command to fetch information about the table structure and extracts the
column names from the result, returning them as a list that can inform subsequent analysis and
visualization steps. Together, these functions streamlined our data analysis workflow, helped
ensure data consistency, and gave us a deeper understanding of the underlying data structure.
[Kre10]
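The two helpers described above are not reproduced verbatim in this report, so the following is only a minimal sketch of how they could look based on that description. The in-memory table and its rows are made up for illustration, and column and table names are interpolated into the SQL text, which is an assumption that only holds for trusted identifiers.

```python
import sqlite3


def get_unique_entries(con: sqlite3.Connection, column: str, table: str) -> dict:
    """Return {value: frequency} for one column, skipping NULL values."""
    # Identifiers cannot be bound as SQL parameters, so they are interpolated;
    # only pass trusted column and table names.
    query = f"SELECT {column}, COUNT(*) FROM {table} GROUP BY {column}"
    rows = con.execute(query).fetchall()
    return {value: count for value, count in rows if value is not None}


def get_column_names(con: sqlite3.Connection, table: str) -> list:
    """Return the column names of a table via PRAGMA table_info."""
    rows = con.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in rows]


# Toy in-memory table (made-up rows) to exercise both helpers.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE users (id INT PRIMARY KEY, gender TEXT)")
demo.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "Man"), (2, "Man"), (3, "Woman"), (4, None)],
)
unique_genders = get_unique_entries(demo, "gender", "users")
column_names = get_column_names(demo, "users")
print(unique_genders)
print(column_names)
```

The row with a NULL gender is counted by the GROUP BY but dropped by the dictionary filter, which matches the filtering step described in the text.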


%pip install matplotlib numpy scipy geopandas >/dev/null
Note: you may need to restart the kernel to use updated packages.


from IPython.display import Markdown, display
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import sqlite3


plt.rcParams["figure.figsize"] = (8, 4)
con = sqlite3.connect("duolicious.db")


def get_unique_entries_as_df(con: sqlite3.Connection, column: str, table: str):
    # Count how often each non-NULL value occurs in the given column.
    query = (
        f"SELECT {column}, COUNT(*) as count FROM {table} "
        f"WHERE {column} IS NOT NULL GROUP BY {column}"
    )
    df = pd.read_sql(query, con)
    return df


def get_all_enteries_as_df(con, column, table):
    # Fetch every non-NULL value of the column, one row per user.
    query = f'SELECT {column} FROM {table} WHERE {column} IS NOT NULL;'
    df = pd.read_sql(query, con)
    return df


def draw_pie_chart(labels, sizes, title):
    # Convert raw counts to percentages and sort slices from largest to smallest.
    total_count = sum(sizes)
    percentages = [(size / total_count) * 100 for size in sizes]
    labels_and_percentages = list(zip(labels, percentages))
    labels_and_percentages.sort(key=lambda x: x[1], reverse=True)
    labels, percentages = zip(*labels_and_percentages)

    plt.pie(percentages, startangle=140)
    plt.axis('equal')

    # Show the percentages in the legend instead of on the slices.
    legend_labels = [
        f'{label}: {percentage:.1f}%'
        for label, percentage in zip(labels, percentages)
    ]
    plt.legend(legend_labels, loc="best", fontsize='small', title=title,
               bbox_to_anchor=(0.85, 0.5))
    plt.title(title)
    plt.show()


def draw_bar_chart(x, y, title):
    plt.bar(x, y)
    plt.title(title)
    plt.xticks(rotation='vertical')
    plt.show()


def print_unique_data(column, title=None, boolean=False):
    if title is None:
        title = column.capitalize()

    df = get_unique_entries_as_df(con, column, "users")

    # Boolean columns are stored as 0/1; show them as "No"/"Yes".
    if boolean:
        df[column] = np.where(df[column] == 0, "No", "Yes")

    display(Markdown(f"##### {title}"))
    display(df)

    draw_bar_chart(df[column], df["count"], title)
    draw_pie_chart(df[column], df["count"], title)


def analyze_and_print_categorical_relationship(index_col, column_col, title,
                                               boolean=False):
    # Count users for every combination of the two categorical columns.
    query = f"""
        SELECT {index_col}, {column_col}, COUNT(*) AS count
        FROM users
        WHERE {index_col} IS NOT NULL
        AND {column_col} IS NOT NULL
        GROUP BY {index_col}, {column_col}
    """

    df = pd.read_sql(query, con)
    if boolean:
        df[column_col] = np.where(df[column_col] == 0, "No", "Yes")

    # One row per index_col value, one column per column_col value.
    df = df.pivot(index=index_col, columns=column_col, values='count').fillna(0)
    display(Markdown(f"#### {title}"))
    display(df.astype(int))

    # Normalize each row so its values sum to 100%.
    df_percentage = df.div(df.sum(axis=1), axis=0) * 100
    pd.set_option('display.precision', 2)
    display(Markdown(f"#### {title} (percentages)"))
    display(df_percentage)

    df_percentage.plot(kind='bar', stacked=True)
    plt.title(title)
    plt.xlabel(index_col)
    plt.ylabel('Percentage')
    plt.xticks(rotation=90)
    plt.legend(title=column_col, bbox_to_anchor=(1.05, 1), loc='upper left')
    plt.show()


3.2.2 The structure of the database 


We designed two tables, Users and Photos, to capture the complexity of the data. The Users table 
was created to store information about each user, with columns representing various attributes 
such as name, age, gender, location, and more. The primary key of the table was set to id, 
ensuring uniqueness and facilitating efficient data retrieval. We also included a column for about, 
which contained a brief description of the user. Additionally, we included columns to capture user 
preferences and habits, such as drinking, drugs, exercise, and smoking. The relationship_status
column was used to store information about the user’s current relationship status, while looking_for
captured their preferences for a romantic partner. We also included columns to store demographic 
information, such as education, height_cm, and orientation. 


The Photos table was created to store information about each user’s photos, with columns repre- 
senting the id of the photo, the uuid of the photo, and the user_id of the user who uploaded the 
photo. The user_id column was set as a foreign key, referencing the id column in the Users table. 
This ensured that each photo was associated with a unique user, enabling us to link the photo 
data back to the corresponding user profile. By designing these tables, we were able to create a 
structured database that could efficiently store and manage the large volume of data retrieved from 
Duolicious. 
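To illustrate the link between the two tables described above, here is a minimal sketch using an in-memory SQLite database with a reduced subset of the columns. The rows are made up, and enabling PRAGMA foreign_keys is an assumption: SQLite does not enforce foreign keys by default, and the report does not say whether enforcement was switched on.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
demo.execute("PRAGMA foreign_keys = ON")
demo.executescript(
    """
    CREATE TABLE Users (id INT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE Photos (
        id INT PRIMARY KEY,
        uuid VARCHAR(64) NOT NULL,
        user_id INT NOT NULL,
        FOREIGN KEY (user_id) REFERENCES Users (id)
    );
    """
)

# Made-up rows: one user with two photos, one user with none.
demo.execute("INSERT INTO Users VALUES (1, 'alice'), (2, 'bob')")
demo.execute("INSERT INTO Photos VALUES (10, 'uuid-a', 1), (11, 'uuid-b', 1)")

# Link photos back to their owners and count them per user.
photos_per_user = dict(
    demo.execute(
        """
        SELECT u.name, COUNT(p.id)
        FROM Users AS u
        LEFT JOIN Photos AS p ON p.user_id = u.id
        GROUP BY u.id
        ORDER BY u.id
        """
    ).fetchall()
)

# A photo pointing at a non-existent user is rejected by the foreign key.
try:
    demo.execute("INSERT INTO Photos VALUES (12, 'uuid-c', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(photos_per_user, orphan_rejected)
```

The LEFT JOIN keeps users without photos in the result, which is useful when checking how many profiles have no photo at all.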


CREATE TABLE IF NOT EXISTS Users (
    id INT PRIMARY KEY,
    name TEXT NOT NULL,
    age INT NULL,
    gender TEXT NOT NULL,
    location TEXT NULL,
    occupation TEXT NULL,
    relationship_status TEXT NULL,
    about TEXT,
    drinking TEXT NULL,
    drugs INT NULL,
    exercise TEXT NULL,
    smoking TEXT NULL,
    looking_for TEXT NULL,
    long_distance INT NULL,
    wants_kids INT NULL,
    education TEXT NULL,
    height_cm INT NULL,
    orientation TEXT NULL,
    religion TEXT NULL,
    star_sign TEXT NULL
);


CREATE TABLE IF NOT EXISTS Photos (
    id INT PRIMARY KEY,
    uuid VARCHAR(64) NOT NULL,
    user_id INT NOT NULL,
    FOREIGN KEY (user_id) REFERENCES Users (id)
);
4 A first look at the data


4.1 Gender 


The distribution of genders among Duolicious users is heavily skewed. The large majority of users
identified as men (47,804), while only 3,085 identified as women. This disparity may be attributed
to a variety of factors, including differences in dating preferences, online behavior, or the
platform’s marketing strategies.


The remaining categories were much smaller: 2,011 users identified as non-binary, 1,710 as trans
women, 553 as “Other”, 262 as trans men, 240 as agender, 147 as transgender, and 36 as intersex.
Overall, these findings highlight both the imbalance between men and women and the diversity of
gender identities on the platform.


One of the most striking features of Duolicious’s user demographics is the skewed ratio of males to 
females. Unlike traditional dating apps, which typically boast a 3:1 or 4:1 ratio of males to females, 
Duolicious’s user base is overwhelmed by males, with a staggering 15:1 ratio. This means that 
for every one female user, there are 15 male users on the platform. This disparity is particularly 
noteworthy, as it deviates significantly from the norms observed in other online dating platforms. 
The reasons behind this skewed ratio are unclear, but it may be attributed to a variety of factors, 
including the platform’s marketing strategies, user acquisition tactics, or even the platform’s repu- 
tation and brand identity. Regardless of the cause, this imbalance has significant implications for 
the user experience, as it may affect the quality and quantity of matches, as well as the overall 
dynamics of online interactions. [ABKB16] 
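The 15:1 figure can be checked directly from the counts in the gender table of this section. The short calculation below is only a worked example of that arithmetic; the variable names are chosen for illustration.

```python
# Counts copied from the gender table in this section.
gender_counts = {
    "Agender": 240,
    "Intersex": 36,
    "Man": 47_804,
    "Non-binary": 2_011,
    "Other": 553,
    "Trans man": 262,
    "Trans woman": 1_710,
    "Transgender": 147,
    "Woman": 3_085,
}

# Ratio of users identifying as men to users identifying as women.
men_per_woman = gender_counts["Man"] / gender_counts["Woman"]

# Share of men among all users who reported a gender.
total_users = sum(gender_counts.values())
men_share = gender_counts["Man"] / total_users * 100

print(f"{men_per_woman:.1f} men per woman")
print(f"{men_share:.1f}% of {total_users} users identify as men")
```

Rounded to whole numbers, this gives the ratio of roughly 15 men per woman quoted in the text.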


print_unique_data("gender", title="Gender Distribution") 


Gender Distribution


        gender  count
0      Agender    240
1     Intersex     36
2          Man  47804
3   Non-binary   2011
4        Other    553
5    Trans man    262
6  Trans woman   1710
7  Transgender    147
8        Woman   3085
[Figure: pie chart “Gender Distribution”. Legend (partially recovered): Woman: 5.5%, Non-binary: 3.6%, Other: 1.0%, Trans man: 0.5%, Agender: 0.4%, Transgender: 0.3%, Intersex: 0.1%]


4.2 Age 


The age distribution of Duolicious users skews strongly towards young adults. The majority of users
fall within the 18-25 age range, and the 18-21 age group alone accounts for nearly 40% of the total
user base, with 6,139 users aged 18, 5,248 aged 19, and 5,566 aged 20. The 22-24 age group is also
well represented, with 5,261 users aged 22, 5,172 aged 23, and 4,744 aged 24.


From age 25 onwards the number of users declines steadily: 3,669 users are aged 25, 2,998 are aged
26, and 2,258 are aged 27. The 30-39 age group accounts for a significantly smaller proportion of
users, the 40-49 age group for an even smaller one, and only a handful of users report ages above
50. Several outliers stand out, such as one user reporting an age of 107, 52 users reporting 116,
and 56 users reporting 117, which likely reflect reporting errors or deliberately false entries.


df = get_all_enteries_as_df(con, "age", "Users")
# Drop ages above 60 so the implausible outliers do not stretch the histogram.
df = df[df['age'] <= 60]

plt.hist(df["age"], bins=100, color='skyblue', edgecolor='black')
plt.title('Age Distribution')
plt.xlabel('Age')
plt.ylabel('Frequency')
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.tight_layout()
plt.show()


4.3 Location


The geographic distribution of Duolicious users shows the global reach of the platform. The United
States dominates, with 23,515 users. The United Kingdom comes second with 3,064 users, followed
by Brazil with 2,343. Canada, Australia, and Germany round out the top six, with 2,173, 1,369,
and 1,173 users, respectively.


Further down the list, users are concentrated in Europe, North America, and South America, with
countries such as Spain, France, Poland, and Argentina having a significant presence. Countries
in Asia, Africa, and the Middle East make up a smaller proportion of the user base; Japan, for
example, has 285 users and India 184.


A further 10,193 users did not provide a location, which suggests that a significant number of
users prefer to remain anonymous or private. This could be due to cultural or geopolitical factors,
or simply a desire for increased online privacy.


One of the most striking aspects of the geographic distribution of Duolicious users is the signif- 
icant presence of users from Eastern European countries, particularly Romania and Poland. It 
is surprising to see that Romania and Poland have a relatively high number of users, with 200 
and 753 users, respectively. This is notable because these countries are not typically associated 
with large-scale online dating platforms, and it suggests that Duolicious has managed to tap into 
a previously underserved market. The popularity of Duolicious in these countries may be due to 
a variety of factors, including cultural and linguistic ties, as well as a growing demand for online 
dating services in these regions. Regardless of the reason, the presence of so many users from 
Romania and Poland is a fascinating aspect of the platform’s demographics, and it highlights the 
global reach and diversity of the Duolicious user base. 


One of the challenges of analyzing the geographic distribution of Duolicious users is the sheer 
number of countries represented on the platform. With users hailing from over 150 countries, 
visualizing the data through graphs or maps becomes a daunting task. To make this task more 
manageable and the resulting visuals more interpretable, we decided to focus on the top 15 countries 
with the most users. This approach allowed us to highlight the most notable trends and patterns in 
the user base, providing a more nuanced understanding of the platform’s global reach and diversity. 
By doing so, we were able to maintain clarity and readability in our visualizations. 
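The top-15 selection described above could be done with a frequency counter. The sketch below is an assumption about how that step might look, not the code used in the study: the location strings are made up, and the country is taken to be the last comma-separated component of each location.

```python
from collections import Counter

# Made-up location strings in a "City, Region, Country" style (assumed format).
locations = [
    "Austin, Texas, United States",
    "Leeds, England, United Kingdom",
    "Denver, Colorado, United States",
    None,  # user did not provide a location
    "Curitiba, Parana, Brazil",
    "Boston, Massachusetts, United States",
    "London, England, United Kingdom",
]


def top_countries(locations, n=15):
    """Count users per country and keep the n most common countries."""
    countries = [loc.split(",")[-1].strip() for loc in locations if loc]
    return Counter(countries).most_common(n)


# With the toy data, keep only the two largest countries.
top = top_countries(locations, n=2)
print(top)
```

Calling top_countries with the default n=15 on the full location column would give the list of countries used for a readable bar chart.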


from collections import Counter

import geopandas as gpd
import pandas as pd
from mpl_toolkits.axes_grid1 import make_axes_locatable
from matplotlib.colors import LogNorm

import warnings
warnings.filterwarnings('ignore')


cur = con.cursor()
cur.execute('SELECT location FROM Users;')
rows = cur.fetchall()

# Keep the last comma-separated component of each location as the country.
countries = [
    location.split(",")[-1].strip()
    for (location,) in rows
    if location is not None
]
country_freq = dict(Counter(countries))
# Match the country name used by the Natural Earth dataset.
country_freq["United States of America"] = country_freq.pop("United States")


# Note: geopandas.datasets was deprecated in GeoPandas 0.14 and removed in 1.0;
# newer versions need the Natural Earth file to be downloaded separately.
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
world = world.merge(
    right=pd.DataFrame(list(country_freq.items()), columns=['name', 'users']),
    how='left', left_on='name', right_on='name'
)


fig, ax = plt.subplots(1, 1, figsize=(9, 4))
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.1)
# A logarithmic color scale keeps small countries visible next to the United States.
vmin, vmax = world['users'].min(), world['users'].max()
world.plot(column='users',
           cmap='OrRd',
           ax=ax,
           legend=True,
           cax=cax,
           norm=LogNorm(vmin=vmin, vmax=vmax),
           legend_kwds={'label': "Number of Users"})
ax.set_title('Number of Users by Country')

plt.show()


[Figure: world map “Number of Users by Country”]


4.4 Relationship status


Relationship status is one of the few profile details by which users of a dating app gauge the
eligibility of potential matches. Of the 33,563 Duolicious profiles that report a status, 31,062
(92.5%) are “Single”. A further 972 users are “Seeing someone”, which could imply a casual,
non-exclusive arrangement, and 724 fall into the catch-all “Other” category, whose relationship
status is not captured by the available labels.


The remaining categories are small: 289 users are “Widowed”, 180 are “Divorced”, 179 are
“Married”, and 157 are “Engaged”. The presence of married and engaged users on a dating app may
raise some eyebrows, and might suggest that they are seeking companionship outside of their
existing relationships, although the data does not reveal their motives.


print_unique_data("relationship_status", title="Relationship Status") 


Relationship Status


  relationship_status  count
0            Divorced    180
1             Engaged    157
2             Married    179
3               Other    724
4      Seeing someone    972
5              Single  31062
6             Widowed    289
[Figure: bar chart “Relationship Status”]

[Figure: pie chart “Relationship Status”. Legend: Single: 92.5%, Seeing someone: 2.9%, Other: 2.2%, Widowed: 0.9%, Divorced: 0.5%, Married: 0.5%, Engaged: 0.5%]


4.5 Religion 


A significant contingent of 9,666 individuals, or roughly 37.2% of the total, identify as “Agnostic”, 
indicating a willingness to question the existence of a higher power or a lack of certainty in their 
religious beliefs. Close behind are 6,249 “Atheists”, who reject the notion of a deity altogether. 
These two groups, comprising over 60% of the user base, highlight the prevalence of non-traditional 
or non-theistic beliefs in the Duolicious community. 


On the other hand, followers of more traditional faiths are also well-represented. Christians, with 
5,560 adherents, form the largest bloc of religiously affiliated users, while 350 individuals identify 
as Muslim, and 177 as Jewish. The presence of these faith groups underscores the app’s appeal 
to a broad cross-section of society. A scattering of users claim affiliation with Eastern religions, 
including 286 Buddhists and 72 Hindus, adding to the rich cultural tapestry of Duolicious. 


Rounding out the spectrum are 3,505 individuals who identify with “Other” religious beliefs or 
practices, which may include everything from paganism to spiritualism. This diverse group is a 
testament to the boundless variations in human spirituality and the app’s ability to attract users 
from the fringes as well as the mainstream. Even the ancient Persian religion of Zoroastrianism, 
with 143 adherents, finds representation within the Duolicious community. 


As users navigate the app’s profiles, they encounter a multitude of perspectives, values, and
belief systems. This diversity may foster an environment of tolerance, open-mindedness, and
mutual respect, which are essential ingredients for forging meaningful connections in the digital
age, although the profile data alone cannot confirm this. A platform with such a range of religious
beliefs at least gives users the opportunity to explore their differences and, perhaps, discover
common ground with like-minded individuals.


print_unique_data("religion") 


Religion


      religion  count
0     Agnostic   9666
1      Atheist   6249
2     Buddhist    286
3    Christian   5560
4        Hindu     72
5       Jewish    177
6       Muslim    350
7        Other   3505
8  Zoroastrian    143
[Figure: bar chart “Religion”]

[Figure: pie chart “Religion”. Legend: Agnostic: 37.2%, Atheist: 24.0%, Christian: 21.4%, Other: 13.5%, Muslim: 1.3%, Buddhist: 1.1%, Jewish: 0.7%, Zoroastrian: 0.5%, Hindu: 0.3%]



4.6 Height by gender 


Based on the dataset, we can gain insight into the average heights of users by gender. For women,
the mean height is approximately 164.18 cm, with a median of 164 cm, meaning that roughly half
of the female users are shorter than 164 cm and half are taller. The standard deviation of 3.71 cm
is smaller than the 2.2 inches (5.59 cm) reported by the Bureau [1], so the heights reported by
women on the app are more tightly clustered than in the general population. The close agreement
between mean and median gives no sign that women systematically exaggerated or underreported
their heights.


In contrast, the average height for men on the app is significantly taller, with a mean of around 
179.76 cm and a median of 180 cm. This suggests that men on the app tend to be taller overall, 
with half of them above 180 cm in height. The standard deviation of 8.79 cm is larger than the 
2.5 inches (6.35 cm) reported by the Bureau [1], indicating a slightly wider range of heights among 
male users compared to the general population. However, the mean and median heights are still 
within a reasonable range, suggesting that men on the app also provided accurate self-reported 
heights, rather than significantly inflating or deflating them. [PSDSO16] 


It’s interesting to note the difference in height distributions between men and women on the app. 
While there is some overlap, the data suggests that men tend to be taller on average, with fewer men 
falling in the shorter range and more in the taller range compared to women. These findings may 
have implications for dating preferences and expectations, and could influence how users present 
themselves and interact with others on the app. Overall, the similarity between the app’s height 
distributions and the general population’s suggests that users took their profiles seriously and were 
truthful about their physical characteristics. 


[1] Bureau data: Adult male height is normally distributed with a standard deviation of about 2.5
inches (6.35 cm) while female height is normally distributed with a standard deviation of about 2.2
inches (5.59 cm).


import numpy as np


def get_unique_entries_where(
    con: sqlite3.Connection, column: str, table: str, where: str
):
    # Count each value of the column among the rows matching the WHERE clause.
    cur = con.cursor()
    cur.execute(
        f"SELECT {column}, COUNT(*) as frequency FROM {table} where {where} "
        f"GROUP BY {column}"
    )
    unique_entries = dict(cur.fetchall())
    unique_entries = {
        k: v for k, v in unique_entries.items() if k is not None and v is not None
    }

    return unique_entries


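def draw_height_graph(index: int, gender: str):
    ax = plt.subplot(1, 2, index)
    d = get_unique_entries_where(con, "height_cm", "Users", f"gender = '{gender}'")
    # Keep plausible heights only; negative or extreme values are anomalies.
    d = {k: v for k, v in d.items() if k >= 100 and k <= 220}

    # Expand the frequency table to one value per user for the statistics.
    # np.repeat needs integer counts, so the counts are normalized only for plotting.
    values = np.repeat(list(d.keys()), list(d.values()))
    mean = np.mean(values)
    median = np.median(values)
    std_dev = np.std(values)

    # Plot relative frequencies so both genders share a comparable scale.
    total = sum(d.values())
    rel = {k: v / total for k, v in d.items()}
    ax.bar(list(rel.keys()), list(rel.values()))
    ax.set_title(f"{gender} heights")
    # Red error bar: mean +/- one standard deviation; green: median +/- the same.
    ax.errorbar(mean, max(rel.values()) / 2, xerr=std_dev, fmt="o", ecolor="Red")
    ax.errorbar(median, max(rel.values()) / 2, xerr=std_dev, fmt="o", ecolor="Green")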


df = get_all_enteries_as_df(con, "height_cm", "Users")
plt.hist(df["height_cm"], bins=50, range=(150, 220), color='skyblue',
         edgecolor='black')
plt.title('Height')
plt.xlabel('Height')
plt.ylabel('Frequency')
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.tight_layout()
plt.show()


draw_height_graph(1, "Woman")
draw_height_graph(2, "Man")
plt.show()
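The mean, median, and standard deviation above are computed from a table of heights and their frequencies. The sketch below illustrates that calculation on a hypothetical frequency table (the numbers are not taken from the dataset) and cross-checks the expanded-sample route against weighted formulas. It assumes the population standard deviation, which is what np.std returns by default.

```python
import numpy as np

# Hypothetical {height_cm: number_of_users} table, not taken from the dataset.
freq = {150: 1, 160: 2, 170: 4, 180: 2, 190: 1, 999: 3, -5: 1}

# Keep only plausible heights, mirroring the 100-220 cm filter used above.
kept = {h: c for h, c in freq.items() if 100 <= h <= 220}
heights = np.array(list(kept.keys()), dtype=float)
counts = np.array(list(kept.values()), dtype=int)

# Route 1: expand the table back into one value per user.
values = np.repeat(heights, counts)
mean_expanded = values.mean()
median_expanded = np.median(values)
std_expanded = values.std()  # population standard deviation (ddof=0)

# Route 2: weighted formulas applied directly to the frequency table.
mean_weighted = np.average(heights, weights=counts)
std_weighted = np.sqrt(np.average((heights - mean_weighted) ** 2, weights=counts))

print(mean_expanded, median_expanded, std_expanded)
print(mean_weighted, std_weighted)
```

Both routes agree; the weighted route avoids materializing one array element per user, which matters little at this scale but keeps the counts as integers, as np.repeat requires.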
4.7 Alcohol Consumption


The Duolicious community’s attitude towards alcohol consumption is diverse. Of the users who
answered, 26.7% (9,648 individuals) reported “Never” consuming alcohol, a significant proportion
who abstain or lead a sober lifestyle. At the other end of the spectrum, 10.4% (3,753 individuals)
reported “Often” consuming alcohol, a sizable group that enjoys social drinking or incorporates
alcohol into their regular routine. The majority, 62.9% (22,706 individuals), reported that they
“Sometimes” consume alcohol, suggesting a flexible and moderate approach to drinking. This
distribution highlights the importance of considering individual differences in lifestyle choices
and habits within the Duolicious community.


[67]: print_unique_data("drinking", title="Alcohol Consumption") 


Alcohol Consumption 


    drinking  count
0      Never   9648
1      Often   3753
2  Sometimes  22706


[Figure: bar chart titled "Alcohol Consumption" with categories Never, Often, Sometimes]
[Figure: pie chart titled "Alcohol Consumption" (Sometimes: 62.9%, Never: 26.7%, Often: 10.4%)]
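

The percentages in the pie chart legend follow directly from the counts in the table. As a quick check, the sketch below recomputes them; `shares` is a hypothetical helper and is not part of the notebook.

```python
def shares(counts):
    """Convert a {category: count} table into percentages rounded to one decimal."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


# Counts copied from the "drinking" table
drinking = {"Never": 9648, "Often": 3753, "Sometimes": 22706}
result = shares(drinking)
```

The same helper applies to the other single-column tables in this chapter, since each percentage is the category count divided by the number of users who answered that question.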


4.8 Recreational Drug Use 


Recreational drug use within the Duolicious community splits into two unequal groups. A
substantial majority of users, 76.9% of those who answered (n = 26,279), reported abstaining
from recreational drug use. The remaining 23.1% (n = 7,916) acknowledged engaging in
recreational drug use.


This split underscores the importance of recognizing and accommodating individual differences
in lifestyle choices, including those that may be considered controversial or sensitive. Letting
users state such preferences openly supports candor and trust, and gives individuals the
opportunity to engage with others who share similar values and practices.


print_unique_data("drugs", title="Drug Consumption", boolean=True)


Drug Consumption


   drugs  count
0     No  26279
1    Yes   7916

[Figure: bar chart titled "Drug Consumption"]
[Figure: pie chart titled "Drug Consumption" (No: 76.9%, Yes: 23.1%)]


4.9 Smoking 


Tobacco use within the Duolicious community shows a similar split. A substantial majority of
users, 75.6% of those who answered (n = 26,445), reported not using tobacco, while 24.4%
(n = 8,523) acknowledged using tobacco products.


This distribution again underscores the importance of accommodating individual differences in
lifestyle choices, including those related to substance use. The share of tobacco users (24.4%,
n = 8,523) is close to the share of recreational drug users (23.1%, n = 7,916), but the two
counts come from separate questions and do not show how many users report both habits.
Measuring that overlap requires cross-tabulating the two columns. By acknowledging and
accepting these differences, the Duolicious platform fosters an environment of openness, trust,
and mutual respect among its users.


print_unique_data("smoking")


Smoking 


   smoking  count
0       No  26445
1      Yes   8523

[Figure: bar chart titled "Smoking"]
[Figure: pie chart titled "Smoking" (No: 75.6%, Yes: 24.4%)]
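

The smoking and drug-use counts come from separate questions, so the overlap between the two habits has to be counted from the joint distribution. The sketch below shows one way to do that with a grouped query. It assumes a `Users` table with `smoking` stored as text and `drugs` stored as an integer flag; that encoding and the sample rows are hypothetical, since the real schema is not shown in this section.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Assumed column types; the real storage of these columns may differ
con.execute("CREATE TABLE Users (smoking TEXT, drugs INTEGER)")
con.executemany(
    "INSERT INTO Users VALUES (?, ?)",
    [("Yes", 1), ("Yes", 0), ("No", 1), ("No", 0), ("Yes", 1), (None, 1)],
)

# Count every (smoking, drugs) combination, skipping unanswered questions
rows = con.execute(
    "SELECT smoking, drugs, COUNT(*) FROM Users "
    "WHERE smoking IS NOT NULL AND drugs IS NOT NULL "
    "GROUP BY smoking, drugs"
).fetchall()

joint = {(smoking, drugs): n for smoking, drugs, n in rows}
both = joint.get(("Yes", 1), 0)  # users reporting both habits
```

Only the `("Yes", 1)` cell of the joint table measures co-occurrence; the marginal counts printed above cannot be substituted for it.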


4.10 Exercise 


The physical activity habits of Duolicious users vary. Of the users who answered, 11.9%
(n = 4,037) reported Never engaging in physical exercise, suggesting a sedentary lifestyle or
limited opportunities for physical activity. In contrast, 33.3% (n = 11,316) reported Often
engaging in physical exercise, indicating a commitment to regular physical activity. The
remaining 54.8% (n = 18,597) reported Sometimes engaging in physical exercise, suggesting a
more flexible or sporadic approach to staying physically active.


This distribution shows a range of habits and priorities around physical activity. By
acknowledging and accommodating these differences, the Duolicious platform lets users connect
with others who share similar values and habits related to physical activity.


print_unique_data("exercise")


Exercise 


    exercise  count
0      Never   4037
1      Often  11316
2  Sometimes  18597

[Figure: bar chart titled "Exercise" with categories Never, Often, Sometimes]
[Figure: pie chart titled "Exercise" (Sometimes: 54.8%, Often: 33.3%, Never: 11.9%)]


4.11 Relationship Goals 


The relationship goals of Duolicious users cover a range of aspirations. The largest group,
44.2% of users (n = 12,486), is looking for Long-term dating, indicating a desire for lasting
romantic partnerships. A further 30.0% (n = 8,469) are seeking Friends, suggesting a focus on
platonic connections and social bonding. A smaller proportion, 9.6% (n = 2,714), is seeking
Marriage, highlighting a commitment to finding a lifelong partner. Finally, 16.2% of users
(n = 4,568) are interested in Short-term dating, suggesting a focus on casual romantic encounters.


This distribution underscores the range of priorities and expectations among users. By
acknowledging and accommodating these differences, the Duolicious platform fosters an
environment of openness and respect, where users can connect with like-minded individuals who
share similar relationship aspirations.


print_unique_data("looking_for", title="Relationship Goals")


Relationship Goals


         looking_for  count
0            Friends   8469
1   Long-term dating  12486
2           Marriage   2714
3  Short-term dating   4568

[Figure: bar chart titled "Relationship Goals" with categories Friends, Long-term dating, Marriage, Short-term dating]
[Figure: pie chart titled "Relationship Goals" (Long-term dating: 44.2%, Friends: 30.0%, Short-term dating: 16.2%, Marriage: 9.6%)]


4.12 Family Planning 


The family planning preferences of Duolicious users are split nearly evenly. Of the users who
answered, 47.7% (n = 13,028) indicated that they do not want to have kids, suggesting a
preference for a child-free lifestyle or a focus on other personal goals.


The remaining 52.3% (n = 14,303) expressed a desire to have children, highlighting the
importance of family and parenthood in their lives. This distribution underscores the diversity
of users' values and priorities regarding parenthood.


By acknowledging and respecting these differences, the Duolicious platform fosters an environment
of inclusivity and understanding, where users can connect with others who share similar values and
aspirations regarding family planning.


print_unique_data("wants_kids", title="Wants Kids", boolean=True)


Wants Kids 
   wants_kids  count
0          No  13028
1         Yes  14303

[Figure: bar chart titled "Wants Kids"]
[Figure: pie chart titled "Wants Kids" (Yes: 52.3%, No: 47.7%)]


4.13 Orientation 


The sexual orientation of Duolicious users is diverse. Most users, 67.2% (n = 23,154),
identified as Straight. The second-largest group, 16.9% (n = 5,823), identified as Bisexual.


Beyond these two categories, a range of other orientations is represented: Pansexual (7.5%,
n = 2,599), Lesbian (2.2%, n = 752), Other (2.1%, n = 738), Queer (1.4%, n = 472), Demisexual
(1.3%, n = 451), Asexual (0.7%, n = 238), and Gay (0.7%, n = 228). Together, roughly one in
three users identifies as something other than Straight. This distribution underscores the
importance of acknowledging and respecting the diversity of human sexuality, and the need for
inclusive environments that recognize the validity of all orientations.


print_unique_data("orientation", title="Sexual Orientation") 


Sexual Orientation 


   orientation  count
0      Asexual    238
1     Bisexual   5823
2   Demisexual    451
3          Gay    228
4      Lesbian    752
5        Other    738
6    Pansexual   2599
7        Queer    472
8     Straight  23154
[Figure: pie chart titled "Sexual Orientation" (Straight: 67.2%, Bisexual: 16.9%, Pansexual: 7.5%, Lesbian: 2.2%, Other: 2.1%, Queer: 1.4%, Demisexual: 1.3%, Asexual: 0.7%, Gay: 0.7%)]


5 A Deeper Look at the Data
5.0.1 Gender and Age 


[74]: def plot_gender_age_distribution(gender, color='blue'):
          # We chose to remove the ages over 60 because they are most likely fake
          df = pd.read_sql(
              f"SELECT age FROM users WHERE gender = '{gender}' AND age <= 60", con
          )
          plt.hist(df['age'], bins=100, color=color)
          plt.title(f'{gender.capitalize()} Age Distribution')
          plt.xlabel('Age')
          plt.ylabel('Frequency')

          plt.grid(axis='y', linestyle='--', alpha=0.7)
          plt.tight_layout()
          plt.show()


      plot_gender_age_distribution("Man")
      plot_gender_age_distribution("Woman", color="red")
      plot_gender_age_distribution("Trans man", color="orange")
      plot_gender_age_distribution("Trans woman", color="pink")
      plot_gender_age_distribution("Non-binary", color="yellow")
[Figure: histogram titled "Non-binary Age Distribution"]
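

The age query in `plot_gender_age_distribution` builds its SQL by interpolating `gender` into an f-string. That works for the fixed labels used here, but a bound parameter is the safer habit when the value could contain quotes. The sketch below shows the same filter with `?` placeholders. `ages_for_gender` and the in-memory rows are hypothetical; only the `users` table and its `gender` and `age` columns come from the query in the notebook.

```python
import sqlite3


def ages_for_gender(con, gender, max_age=60):
    """Return ages for one gender, excluding ages above max_age."""
    cur = con.execute(
        "SELECT age FROM users WHERE gender = ? AND age <= ?",
        (gender, max_age),  # values are bound by the driver, not pasted into the SQL
    )
    return [row[0] for row in cur.fetchall()]


# Hypothetical rows for illustration only
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (gender TEXT, age INTEGER)")
con.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("Man", 25), ("Man", 99), ("Woman", 31), ("Trans man", 22)],
)
ages = ages_for_gender(con, "Man")
```

pandas' `read_sql` accepts a `params` argument, so the same placeholder query can be passed to it unchanged.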
5.1 Gender and Sexual Orientation


This section looks at the intersection of gender identity and sexual orientation. The most
striking pattern is how much the distribution of sexual orientation varies between gender
identities.


The largest disparity is between men and women. While 77.05% of men identify as straight, only
28.71% of women do. Women on the platform are far more likely to report non-heterosexual
orientations, with 39.64% identifying as bisexual and 12.71% as lesbian. This difference could
be attributed to a range of factors, including social and cultural influences, as well as
differences in gender socialization.


Users who identify as non-binary, agender, trans man, trans woman, or transgender report a much
wider range of orientations, and fewer than 4% of each of these groups identify as straight.
Bisexual, pansexual, and queer are common answers throughout. Trans men stand out for their high
rates of bisexual (44.79%) and gay (19.27%) identification, while trans women most often
identify as lesbian (29.78%) or bisexual (28.99%).


The smaller gender identity groups show patterns of their own, although their sample sizes call
for caution. Pansexual is the most common answer among intersex users (42.11%, 8 of 19
respondents). Agender users most often identify as pansexual (28.31%) and report the highest
asexual share of any group (9.04%). These findings underscore the importance of recognizing and
respecting the diversity of gender identities and sexual orientations within the community.


analyze_and_print_categorical_relationship("gender", "orientation", "Gender and Sexual Orientation")


Gender and Sexual Orientation 
orientation  Asexual  Bisexual  Demisexual  Gay  Lesbian  Other  Pansexual  \
gender
Agender           15        33           8    6       17     11         47
Intersex           1         5           1    0        0      1          8
Man              122      4058         298  152       14    476       1497
Non-binary        38       425          63   18       74     57        448
Other              8        56          17    2       17     68         66
Trans man          6        86           2   37        0      7         15
Trans woman       19       367          25    6      377     43        325
Transgender        2        35           3    2       10      2         27
Woman             27       758          34    5      243     73        166

orientation  Queer  Straight
gender
Agender         25         4
Intersex         3         0
Man             88     22514
Non-binary     139        37
Other           32        20
Trans man       32         7
Trans woman     82        22
Transgender     14         1
Woman           57       549


Gender and Sexual Orientation (percentages)


orientation  Asexual  Bisexual  Demisexual    Gay  Lesbian  Other  Pansexual  \
gender
Agender         9.04     19.88        4.82   3.61    10.24   6.63      28.31
Intersex        5.26     26.32        5.26   0.00     0.00   5.26      42.11
Man             0.42     13.89        1.02   0.52     0.05   1.63       5.12
Non-binary      2.93     32.72        4.85   1.39     5.70   4.39      34.49
Other           2.80     19.58        5.94   0.70     5.94  23.78      23.08
Trans man       3.12     44.79        1.04  19.27     0.00   3.65       7.81
Trans woman     1.50     28.99        1.97   0.47    29.78   3.40      25.67
Transgender     2.08     36.46        3.12   2.08    10.42   2.08      28.12
Woman           1.41     39.64        1.78   0.26    12.71   3.82       8.68

orientation  Queer  Straight
gender
Agender      15.06      2.41
Intersex     15.79      0.00
Man           0.30     77.05
Non-binary   10.70      2.85
Other        11.19      6.99
Trans man    16.67      3.65
Trans woman   6.48      1.74
Transgender  14.58      1.04
Woman         2.98     28.71
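

Each row of the percentage table is the matching row of the count table divided by that row's total, so every gender sums to 100%. The definition of `analyze_and_print_categorical_relationship` is not shown in this section, so the sketch below is only an assumption about that relationship; `row_percentages` is a hypothetical helper. It rebuilds the Intersex row from its counts.

```python
from collections import Counter, defaultdict


def row_percentages(pairs):
    """Cross-tabulate (row, column) pairs and normalise each row to 100%."""
    counts = defaultdict(Counter)
    for row_key, col_key in pairs:
        counts[row_key][col_key] += 1

    table = {}
    for row_key, row in counts.items():
        total = sum(row.values())
        table[row_key] = {col: round(100 * n / total, 2) for col, n in row.items()}
    return table


# Intersex counts copied from the table: 19 respondents in total
intersex_counts = {
    "Asexual": 1, "Bisexual": 5, "Demisexual": 1,
    "Other": 1, "Pansexual": 8, "Queer": 3,
}
pairs = [
    ("Intersex", orientation)
    for orientation, n in intersex_counts.items()
    for _ in range(n)
]
percentages = row_percentages(pairs)
```

In pandas, `pd.crosstab(df["gender"], df["orientation"], normalize="index")` produces the same row-normalised proportions, which can then be multiplied by 100.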
5.2 Gender and Religion


Religious affiliation within the Duolicious community differs by gender, although non-theistic
answers dominate in every group.


Men report the highest share of Christian affiliation of any gender identity (23.74%,
n = 5,215). Christianity is nevertheless not the most common answer among men: Agnostic
(37.13%, n = 8,158) is, and Atheist (22.68%, n = 4,983) follows closely behind Christian.


Women show a similar pattern. Christian is the third most common answer (17.25%, n = 245),
after Agnostic (32.32%, n = 459) and Atheist (27.96%, n = 397). Smaller numbers of women
identify as Buddhist, Muslim, Zoroastrian, Jewish, or Hindu.


Among non-binary users and the other gender identities, Agnostic and Atheist together account
for more than half of the answers in most groups, while Christian affiliation is much lower,
typically between 3% and 8%. The one exception is intersex users (16.67% Christian), a figure
based on only 12 respondents. These groups also include small numbers of Buddhist, Jewish, and
Muslim users.


Overall, Agnostic and Atheist are the most common answers across all gender identities, while
Christianity has a notable presence among men and women and a much smaller one among other
gender identities. Gender plays a role in shaping these affiliations, and the platform hosts a
wide range of religious and non-religious perspectives.
[76]: analyze_and_print_categorical_relationship("gender", "religion", "Religion by gender identity")


Religion by gender identity 


religion     Agnostic  Atheist  Buddhist  Christian  Hindu  Jewish  Muslim  \
gender
Agender            48       44         4          6      0       2       2
Intersex            2        4         1          2      0       0       0
Man              8158     4983       224       5215     62     130     311
Non-binary        430      341        16         41      1       7       9
Other              84       53         1          9      1       7       3
Trans man          63       48         4          6      0       3       0
Trans woman       392      367        12         31      2      12       3
Transgender        30       12         0          5      0       3       1
Woman             459      397        24        245      6      13      21

religion     Other  Zoroastrian
gender
Agender         26            1
Intersex         3            0
Man           2772          115
Non-binary     184            7
Other           72            4
Trans man       29            0
Trans woman    161            2
Transgender     17            0
Woman          241           14


Religion by gender identity (percentages)


religion     Agnostic  Atheist  Buddhist  Christian  Hindu  Jewish  Muslim  \
gender
Agender         36.09    33.08      3.01       4.51   0.00    1.50    1.50
Intersex        16.67    33.33      8.33      16.67   0.00    0.00    0.00
Man             37.13    22.68      1.02      23.74   0.28    0.59    1.42
Non-binary      41.51    32.92      1.54       3.96   0.10    0.68    0.87
Other           35.90    22.65      0.43       3.85   0.43    2.99    1.28
Trans man       41.18    31.37      2.61       3.92   0.00    1.96    0.00
Trans woman     39.92    37.37      1.22       3.16   0.20    1.22    0.31
Transgender     44.12    17.65      0.00       7.35   0.00    4.41    1.47
Woman           32.32    27.96      1.69      17.25   0.42    0.92    1.48

religion     Other  Zoroastrian
gender
Agender      19.55         0.75
Intersex     25.00         0.00
Man          12.62         0.52
Non-binary   17.76         0.68
Other        30.77         1.71
Trans man    18.95         0.00
Trans woman  16.40         0.20
Transgender  25.00         0.00
Woman        16.97         0.99

[Figure: chart titled "Religion by gender identity"]
5.3 Gender and Alcohol Consumption


An analysis of the drinking habits of users on the Duolicious dating app shows some differences
between gender identities. Men make up the large majority of respondents, so their rate of
frequent drinking necessarily sits close to the platform-wide figure of 10.4% (3,753 users in
total, Section 4.7). Agender users are among the most likely to abstain: 34.4% (56 of 163)
never drink, and 12.3% (20 of 163) drink often. Non-binary individuals and trans women report
frequent drinking at around 20%, and trans men and individuals who identify as "other" at
around 30%. Women report frequent drinking at around 10%. Overall, these findings suggest that
drinking habits vary with gender identity on the Duolicious platform, although the smallest
groups (for example, 18 intersex respondents) are too small to support firm conclusions.


analyze_and_print_categorical_relationship("gender", "drinking", "Gender and Alcohol Consumption")


Gender and Alcohol Consumption 


drinking  Never  Often  Sometimes
gender
Agender      56     20         87
Intersex      4      1         13
Gender and Alcohol Consumption (percentages)


5.4 Gender and Drug Usage

Drug use among users on the Duolicious dating app also varies by gender identity. Men are by
far the largest group in both the "No" and "Yes" columns, and they report the lowest rate of
drug use at 20.80%. Women (27.07%) and agender users (29.68%) follow. Users who identify as
"other" (36.84%) and trans men (38.97%) fall in the middle. The highest rates are reported by
intersex users (42.11%, 8 of 19 respondents), non-binary users (42.15%), trans women (44.01%),
and transgender users (47.31%). These findings suggest that gender identity may play a role in
shaping individuals' attitudes and behaviors towards drug use on the Duolicious platform.


analyze_and_print_categorical_relationship("gender", "drugs", "Gender and Drug Usage", boolean=True)


Gender and Drug Usage 


drugs           No   Yes
gender
Agender        109    46
Intersex        11     8
Man          22957  6028
Non-binary     744   542
Other          192   112
Trans man      119    76
Trans woman    678   533
Transgender     49    44
Woman         1420   527


Gender and Drug Usage (percentages)


drugs           No    Yes
gender
Agender      70.32  29.68
Intersex     57.89  42.11
Man          79.20  20.80
Non-binary   57.85  42.15
Other        63.16  36.84
Trans man    61.03  38.97
Trans woman  55.99  44.01
Transgender  52.69  47.31
Woman        72.93  27.07
5.5 Gender and Smoking


Smoking habits also differ across gender identities. Intersex individuals have the lowest
smoking rate at 16.67%, although this figure rests on only 18 respondents. Men have the
next-lowest rate at 23.25%, about six percentage points below Women at 29.33%. Trans women
(28.04%) and Other genders (29.43%) report rates close to that of Women. The highest rates are
found among Non-binary individuals (33.51%), Transgender individuals (35.42%), Agender
individuals (37.27%), and Trans men (37.70%). These findings highlight the differences in
smoking habits across gender identities, and may inform targeted interventions and health
promotion strategies.


analyze_and_print_categorical_relationship("gender", "smoking", "Gender and Smoking")


Gender and Smoking 


smoking         No   Yes
gender
Agender        101    60
Intersex        15     3
Man          22772  6897
Non-binary     877   442
Other          211    88
Trans man      119    72
Trans woman    888   346
Transgender     62    34
Woman         1400   581


Gender and Smoking (percentages)


smoking         No    Yes
gender
Agender      62.73  37.27
Intersex     83.33  16.67
Man          76.75  23.25
Non-binary   66.49  33.51
Other        70.57  29.43
Trans man    62.30  37.70
Trans woman  71.96  28.04
Transgender  64.58  35.42
Woman        70.67  29.33
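

The group sizes in this table differ by three orders of magnitude, so the same percentage carries very different uncertainty for 18 intersex respondents than for 29,669 men. One common way to express this is a Wilson score interval for each proportion. The sketch below is an illustration added for this purpose, not part of the original analysis; `wilson_interval` is a hypothetical helper, and it treats respondents as an independent random sample, which is an assumption.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 for roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Smokers and group sizes copied from the count table
intersex_lo, intersex_hi = wilson_interval(3, 18)
men_lo, men_hi = wilson_interval(6897, 29669)
```

For the intersex row the interval spans roughly 6% to 39%, while for men it is less than one percentage point wide, so the ranking of the smallest groups should be read with caution.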
5.6 Gender and Exercise


Physical activity patterns also differ across gender identities. Men exercise the most: 35.84%
exercise often, 54.39% exercise sometimes, and only 9.77% never exercise. Every other group
reports never exercising at rates above 20%. Among Agender individuals, 62.42% exercise
sometimes, while 24.16% never exercise. Intersex individuals are split between never exercising
(31.25%), exercising often (31.25%), and exercising sometimes (37.50%), based on 16 respondents.
Non-binary individuals exercise sometimes at a rate of 58.71%, while 20.96% never exercise.
Other genders have a similar pattern, with 54.42% exercising sometimes and 25.44% never
exercising. Trans men exercise sometimes at a rate of 58.01%, while Trans women have a higher
rate of never exercising, at 29.86%, and the lowest rate of exercising often, at 11.87%.
Transgender individuals have a similar pattern, with 58.89% exercising sometimes and 22.22%
never exercising. Women fall close to these groups, with 54.95% exercising sometimes and 22.83%
never exercising. These findings highlight the differences in physical activity patterns across
gender identities, and may inform targeted health promotion strategies.


[80]: analyze_and_print_categorical_relationship("gender", "exercise", "Gender and Exercise")


Gender and Exercise 


exercise     Never  Often  Sometimes
gender
Agender         36     20         93
Intersex         5      5          6
Man           2831  10383      15760
Non-binary     266    258        745
Other           72     57        154
Trans man       42     34        105
Trans woman    352    140        687
Transgender     20     17         53
Woman          413    402        994


Gender and Exercise (percentages)


exercise     Never  Often  Sometimes
gender
Agender      24.16  13.42      62.42
Intersex     31.25  31.25      37.50
Man           9.77  35.84      54.39
Non-binary   20.96  20.33      58.71
Other        25.44  20.14      54.42
Trans man    23.20  18.78      58.01
Trans woman  29.86  11.87      58.27
Transgender  22.22  18.89      58.89
Woman        22.83  22.22      54.95
5.7 Gender and Relationship Goals


This section looks at the relationship goals of users across different gender identities. Men
are the only group in which Long-term dating is the most common goal (47.38%). In every other
group, including women (52.77%), Friends is the most common answer. Non-binary individuals are
the most open to Short-term dating (24.27%), followed by intersex individuals (22.22%) and
trans women (20.64%), while women (13.02%) and agender individuals (13.18%) are the least.
Trans men report the highest share seeking Friends (59.64%). Marriage is a minority goal in
every group, ranging from 3.11% among non-binary individuals to 16.67% among intersex
individuals. Overall, the data highlights the complexity and individuality of people's romantic
preferences across gender identities. [HLD20]


analyze_and_print_categorical_relationship("gender", "looking_for", "Gender and Relationship Goals")


Gender and Relationship Goals


Gender and Relationship Goals (percentages)


looking_for  Friends  Long-term dating  Marriage  Short-term dating
gender
Agender        51.16             28.68      6.98              13.18
Intersex       38.89             22.22     16.67              22.22
Man            26.64             47.38     10.13              15.85
Non-binary     44.05             28.57      3.11              24.27
Other          51.19             20.24     13.89              14.68
Trans man      59.64             19.28      4.82              16.27
Trans woman    42.87             31.90      4.59              20.64
Transgender    53.93             22.47      5.62              17.98
Woman          52.77             24.55      9.66              13.02

[Figure: chart titled "Gender and Relationship Goals"]
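

Which goal is most common in each group can be read off the percentage table mechanically. The sketch below does this for four of the rows; the values are copied from the table, while the dictionary layout and variable names are hypothetical.

```python
# Percentages copied from the "Gender and Relationship Goals (percentages)" table
goals = {
    "Man": {"Friends": 26.64, "Long-term dating": 47.38,
            "Marriage": 10.13, "Short-term dating": 15.85},
    "Woman": {"Friends": 52.77, "Long-term dating": 24.55,
              "Marriage": 9.66, "Short-term dating": 13.02},
    "Non-binary": {"Friends": 44.05, "Long-term dating": 28.57,
                   "Marriage": 3.11, "Short-term dating": 24.27},
    "Agender": {"Friends": 51.16, "Long-term dating": 28.68,
                "Marriage": 6.98, "Short-term dating": 13.18},
}

# Most common goal per gender: the column with the largest percentage in each row
top_goal = {gender: max(row, key=row.get) for gender, row in goals.items()}

# Group most open to short-term dating among the rows included here
most_short_term = max(goals, key=lambda g: goals[g]["Short-term dating"])
```

With the table loaded as a pandas DataFrame, `df.idxmax(axis=1)` gives the same per-row answer.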
5.8 Gender and Kids (percentages)


Attitudes towards having children vary considerably by gender identity. Men are the only group
in which a majority (57.52%) want to have children. Among agender, non-binary, trans man, and
trans woman individuals, more than 80% do not want to have kids, and transgender individuals
are close behind at 78.05%. Women and individuals who identify as "other" fall in between:
37.79% of women and 27.46% of "other" individuals want to have kids. Intersex individuals lean
two to one against, with 66.67% not wanting kids and 33.33% wanting them, although this is
based on only 15 respondents. These findings highlight the diversity of opinions on parenthood
within different gender identity groups.


The disparity between men and women is the most notable result. While a majority of men
(57.52%) want to have kids, only 37.79% of women share the same desire, a gap of about 20
percentage points. This difference stands out and suggests that societal expectations, cultural
norms, or other factors may be playing a role in shaping these differing aspirations. [FNR22]


[82]: analyze_and_print_categorical_relationship("gender", "wants_kids", "Gender and Kids", boolean=True)


Gender and Kids 


wants_kids     No    Yes
gender
Agender       110     27
Intersex       10      5
Man          9743  13192
Non-binary    907    187
Other         177     67
Trans man     151     21
Trans woman   865    178
Transgender    64     18
Woman        1001    608


Gender and Kids (percentages)


wants_kids      No    Yes
gender
Agender      80.29  19.71
Intersex     66.67  33.33
Man          42.48  57.52
Non-binary   82.91  17.09
Other        72.54  27.46
Trans man    87.79  12.21
Trans woman  82.93  17.07
Transgender  78.05  21.95
Woman        62.21  37.79
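

Whether the gap between men and women is larger than sampling noise would explain can be checked with a two-proportion z-test on the counts in the table. The sketch below is an illustration added for this purpose, not part of the original analysis. `two_proportion_z` is a hypothetical helper, and the test assumes the two groups are independent random samples.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference between two proportions, pooled variance."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Counts copied from the "Gender and Kids" table
men_yes, men_total = 13192, 13192 + 9743
women_yes, women_total = 608, 608 + 1001

gap_points = 100 * (men_yes / men_total - women_yes / women_total)
z = two_proportion_z(men_yes, men_total, women_yes, women_total)
```

The z statistic comes out far above the conventional 1.96 threshold, so under these assumptions the roughly 20-point gap is not a sampling artifact. The test says nothing about why the gap exists.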
6 Conclusion


In conclusion, this analysis has examined user data from the Duolicious platform, highlighting
notable trends, patterns, and outliers. Anomalous data points, such as unrealistic height
values, were identified and filtered out before statistics were calculated and visualized. The
analysis also revealed a highly skewed ratio of men to women and a low proportion of straight
women. Furthermore, inconsistencies in relationship status data were identified, pointing to
potential issues with data collection or categorization.


These results underscore the need for careful data handling and quality control when studying
user demographics. Despite the data quality issues, the analysis sheds light on the platform's
global reach, diversity, gender dynamics, sexual orientation distribution, and relationship
status inconsistencies. These findings have the potential to inform platform development,
marketing strategies, and user experience improvements.